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Abstract. Adaptive Kinetic Monte Carlo combines the simplicity of Kinetic 
Monte Carlo (KMC) with a Molecular Dynamics (MD) based saddle point 
search algorithm in order to simulate metastable systems. Key to making 
Adaptive KMC effective is a stopping criterion for the saddle point search. In 
this work, we examine a criterion, recently appearing in [6], that is based on the 
fraction of total reaction rate found instead of the fraction of observed saddles. 
The criterion uses the Eyring-Kramers law to estimate the reaction rate at the 
MD search temperature. We also consider a related criterion that remains 
valid when the Eyring-Kramers law is not. We examine the mathematical 
properties of both estimators and prove their mean square errors are well 
behaved, vanishing as the simulation continues to run. 


1. Introduction 

An outstanding problem in theoretical materials science and chemistry is how to 
reach laboratory time scales of microseconds (10“® s) and longer using Molecular 
Dynamics (MD) based models which resolve the atomistic time scale of femtosec¬ 
onds (10“^® s). Much of this scale separation is due to the presence of metastable 
regions in the configuration space of the system. In such regions, often defined by 
local minima of an energy landscape, the system stays close to a particular config¬ 
uration, such as a local minima, before crossing into some other metastable region 
associated with a different configuration. Consequently, during much of a direct 
MD simulation, the system is close to one metastable region or another. It exhibits 
dynamics akin to a continuous time random walk on the set of metastable states, 
with comparatively long waiting times. 

Since much of the physical significance of these systems is characterized by the 
sequence of visited metastable states and the time spent in each, there have been 
a variety of efforts to systematically coarse grain the MD trajectory into a more 
computationally efficient continuous time random walk. A.F. Voter has proposed 
three methods. Parallel Replica Dynamics, Hyperdynamics, and Temperature Ac¬ 
celerated Dynamics, which can overcome metastability through intelligent usage of 
the primitive Langevin dynamics, [I41I16] . In recent years, significant effort has 
been made to understand and quantify the approximations in these methods and 
extend their applicability, [THHlIS HTTlfTillTS] . 

Another approach to the problem is Kinetic Monte Carlo (KMC), and this will be 
the focus of this work. Let us assume our system is governed by a potential energy 
V{x), X € at inverse temperature /3. Furthermore, we assume that we have 
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partitioned configuration space into an at most countable set of metastable states, 
associated with local minima rrii of V. The system can go from metastable 
state i to metastable state j if there is a saddle point, s^-, of V{x) joining fli and 
rij. For conciseness, we will assume there is a single saddle point joining two given 
adjacent metastable states, though, in general, there may be multiple pathways. 

In traditional KMC, before a simulation is run, one must identify the metastable 
states, their connectivity {i.e., which ones are joined by saddle points), and the 
reaction rates of each such connection. Given all of this information, KMC is very 
cheap to simulate. A single random number is generated and used to select one of 
the possible reactions, the system migrates into the new metastable region, and the 
algorithm repeats. 

Unfortunately, such complete details of the metastable states and their con¬ 
nectivity are, a priori, unavailable in all but the simplest low dimensional sys¬ 
tems. This has motivated the development of Adaptive Kinetic Monte Carlo 
(AKMC), j6l[T7l[T8]. In AKMC, the system starts in some metastable region 11^. 
Saddle points associated with Vli are then sought via a saddle point search algorithm 
that successively finds Sij. Reaction rates for each such saddle can be estimated by 
the Eyring-Kramers law [8]: 

(IT) kij = Qij exp [-P{V{sij) - V{mi)] , 

where, writing Ai for the sole negative eigenvalue of V^U(sy ), 

I All / AetV^V{mi) 

TT y detV^U(sy) 

Once a sufficient number of saddles associated with have been identified, the 
problem is treated by using traditional KMC with the thus far identified reactions 
and their rates; this process then repeats in the next metastable region. Two things 
are needed to proceed with AKMC: 

(1) A saddle point search algorithm; 

(2) A stopping criterion. 

In this work, we will consider the question of the stopping criterion, provided our 
saddle point search algorithm satisfies certain assumptions. Our analysis will focus 
on estimators similar to the one introduced by Chill & Henkelman in [5]. We call 
these Chill type estimators. 

In [5] , the authors searched for saddle points out of each metastable state using 
high temperature MD. For concreteness, consider the Brownian dynamics in 

(1.2) dXt = -VV{Xt)dt + ^/2|3-^dWt. 

The aim is to model the dynamics at low temperature /3 = /3^°. Starting at Aq G 
integrate (EH at a higher temperature /3 = {i.e., /3'° > /3''*) until the trajectory 

leaves Using the higher temperature /3^‘ allows an escape to occur more quickly. 
After the trajectory leaves one of the saddle points is identified with this 
pathway using, for instance, the nudged elastic band method liiinj, and the low 
temperature reaction rate is computed using (II. ip with /3 = /3*°. This is then 
repeated, with a new initial condition chosen in 17^. Throughout, the cumulative 
simulation time is recorded. 

Other saddle point search algorithms have been proposed, including the Dimer 
method and the string method mm- In our analysis, the key property that we 
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need to hold true for all of our search methods is the following. Let 
(1.3) Nij(t) = Number of times saddle Sij has been found by time t. 


Then for fixed i, during a saddle point search, the Nij(t) are independent, with 
respect to j, Poisson processes. We prove below that this holds for a carefully 
performed saddle point search via integration of (O. 

This article is organized as follows. We describe the saddle point search in detail 
in Section [2] below, and prove some of its properties, including the above condition 
on Nij(t), in Section |3] below. In Section 2] we introduce stopping criteria for the 
saddle point search, and in Section [5] we analyze these criteria. Section [5] contains 
proofs of some of the estimates in Section O In Section [3 we make some concluding 
remarks. 


2. Notation and saddle point search algorithm 


Here and throughout (W) is Brownian dynamics, that is, a stochastic process 
satisfying (EH). For simplicity we fix a single metastable set H and suppress 
the index i in all of our notations from the Introduction. For our purposes, V is 
smooth, and H is an (open) basin of attraction of V with respect to the gradient 
dynamics dy/dt = — W{y). We assume that is partitioned into finitely many 
disjoint (measurable) subsets, called pathways and labeled 1,2,...,A^, such that 
each pathway j contains a unique saddle point Sj of V. When (Xt) leaves H, it 
must exit through one of the pathways 1, 2,..., iV. 

The algorithm, as well as our analysis, depends heavily on the quasistationary 
distribution (QSD) for (Xt) in which we denote by p. The QSD is a prob¬ 
ability measure that is locally invariant for (Xt), in the sense that it is invariant 
conditionally on the event that (Xt) remains in H: 

Definition 2.1. The QSD for (Xt) in D, is a probability measure v supported in 12 
such that for all t > 0, 


i/(-) = F(Xt G • I Xo ^ p, Xg S 12 for all s G [0, t]). 


Of course v depends on 12, but for simplicity we do not indicate this explicitly. 
It has been shown m that p exists, is unique, and satisfies 

(2.1) v(A) = lim F(Xt G | Xg G A for s G [0, t]), for all A C 12. 


Moreover this convergence is exponentially fast, uniformly in A. Equation (12.111 
leads to simple algorithms for sampling p, based on the idea that a sample can be 
obtained from the endpoint of a trajectory of (Xt) that has remained in 12 for a 
sufficiently long time; see [5] for details. 

We are now ready to state the high temperature saddle point search algorithm. 
Versions of this algorithm have been used previously; see for instance [5] and ref¬ 
erences therein. The search runs at a user-specified “high” (inverse) temperature 
/3*'b Below we write p for the QSD in 12 at temperature /3 = /3’'b We also write 



for the Heaviside unit step function. 
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Algorithm 2.2. Set Nj{t) = 0 for t > 0 and j = 1,... ,N. Let M be the current 
cycle of the algorithm, and tsim the simulation clock. Initialize M = 1 and tsim = 0, 
and iterate the following: 

1. Generate a sample xm from v. The simulation clock tsim is stopped during 
this step. 

2. Starting at Aq = xm, evolve (Xt) at j3 = /3^‘ until it first leaves LI, say at 
time t = through pathway . The simulation clock tsim is running 
during this step, and the stopping criterion is continuously checked. If at 
some time tsim the criterion is met, the algorithm stops. 

3. If ll^l = j, update Nj{t) = Nj{t) + H(t — tsim) for t > 0 and record the 
saddle point Sj. Then update M = M 1. The simulation clock tsim is 
stopped during this step. 

It is not necessary to know N, and the pathways can be given labels according 
to the order in which they are found. The simulation clock is cumulative, and it 
only increases in Step 2. In particular, during the M-th cycle of the algorithm, tsim 
increases by . The stopping criterion will be described in Section |4l Below we 
write tsim for the final value of the simulation clock in the algorithm, that is, its 
value when the simulation is stopped. To refer to a generic simulation clock time 
we write t. Thus, 0 < t < tsim and when the algorithm stops, Nj(t) is the number 
of times an exit through pathway j has been observed by time t. Below we write 
Nj{t) for its final value when the algorithm stops. We will also use the following 
notations: 

N 

(2.2) x,(t) = liv,(t)>i, N{t) = ^iV,(t). 

That is, Xj{t) = 1 if an exit through pathway j has been observed at least once by 
time t, and is 0 otherwise; N{t) is the total number of exits observed by time t. 

3. Properties of the saddle point search 

Our hrst result follows immediately from properties of the QSD established 

m[II]. 

Theorem 3.1. Suppose that in Step 1 in the M-th cycle of Alaorithm \2.^ xm is 
a random variable with distribution v. Then: 

(i) is exponentially distributed with mean k~^ : > t) = exp(—Kt), 

(a) and are independent. 

Theorem 13.II then leads to the following. 

Theorem 3.2. Suppose that in Step 1 of Algorithm Xi,X 2 ,--. are iid with 
common distribution v. Then: 

(i) {.^(i)}o<t<tsim i^ 0 , Poisson process with parameter k, 

(a) independent Poisson processes with parameters 

(3.1) Kj := Kpj, pj := = j). 

Proof. Let {N{s))s>o be a Poisson process with parameter k, which we denote by 
N(s) for brevity. Label each arrival time of N(s) with a pathway j according to 
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the distribution pj, independently of the other arrival times, and let Nj(s) be the 
process with arrivals labeled by j. Then for r, s > 0 and mi,..., niN > 0, 


^ } j 



By summing over all rrii > 0 for i ^ j in the last expression above, we see that for 
fixed r, s > 0, the increment Nj{r + s) — Nj{r) is Poisson distributed with mean 
KpjS. Nj{s) also inherits independent increments from N{s). This shows that Nj{s) 
is a Poisson process with parameter kj = Kpj. Moreover, ( 13 . 21 ) shows that Nj{s), 
j = 1, ... ,N, are independent. 

Let us now relate {N{s))s>o with (./V(s))o<s<tsim- For fixed s G [0,tsim], the 
time marginal N{s) is the largest m such that + ... + < s. Together with 

part (i) of Theorem l3.ll this shows that on [0,fsim], (-^(s))o<s<tsim {N{s))s>o 

are Poisson processes with the same law. By part (ii) of Theorem 13.11 it follows 
that the multivariate processes (.^j(s))o<s<t’„ and have the same 

law. This establishes the result. □ 


4. Chill type estimators and stopping criteria 


The purpose of the high temperature saddle point search ('Algorithm 12 . 21 ) is to 
locate “enough” of the low-temperature rate corresponding to the metastable set fl. 
More precisely, at a low temperature corresponding to ,5 = /3'°, the first exit time of 
Xt from n is approximately exponentially distributed with mean (ki -|- ... -I- 
where kj = k}° is given by the Eyring-Kramers law (II.ip at /3 = (recall the 
subscript i has been suppressed). See [4] and references therein for rigorous results 
in this direction. The kj's are then exponential rates associated with leaving VL 
through pathway j at low temperature /3*°. The proportion of low temperature 
rate found by time t in Algorithm 12.21 is 


(4.1) 


R{t) := 




The expected value of R{t) is 


(4.2) 

where 


E[R{t)] = R{t) := 




(4.3) pj{t) := E[xj{t)] = 1 - exp{-Kjt). 

Here kj is defined as in Theorem l3.2l at temperature /3 = /3^‘. The idea behind Chill- 
type estimators is that when R{t) is sufficiently close to 1, the high temperature 
saddle point search can stop. There are two obstacles to this idea. 
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The first is that, at any time during Algorithm 12.21 it is unlikley that all saddle 
points have been found. This problem is remedied by replacing kj in (|4.1I1 with 
which is computable once pathway j has been found during the simulation. 
The second obstacle is that an exact formula ior pj{t) := E[xj(t)] will not be known 
in practice. Chill-type estimators overcome the latter obstacle by using one of the 
following approximations: 


(4.4) 

Pj{t) := 1 - exp[-fc'‘T], 
Pjit) := 1 - exp[-Nj{t)], 


fchi given by the Eyring-Kramers law at 


N,it) := 


N, it), N^{t)>2, 

O, else 




We have used the superscript to indicate that the rate in (14.411 is computed at 
temperature /3^‘ (whereas kj is computed at low temperature /3'°). Also note that 
Pj{t) is a physical estimate of E[xj(t)] based on Eyring-Kramers, while Pj{t) is a 
(biased) Monte Carlo estimator. Erom (I4.4|l we obtain the following estimators for 
R{t): 


(4.5) 


R{t) := 


J2f=iPj(.t)Xjit)kj 


m := 


J2j=iPjit)Xj{t)kj 

Ef=iX,W% 


R{t), R{t), and R{t) are all random, while R{t) is deterministic. Both R{t) and 
R{t) are explicitly computable at time t during the saddle point search. See [6] for 
further discussion of R{t). To our knowledge R{t) has not appeared before in the 
literature. We emphasize that R{t) may be used at any temperature /3*", while R{t) 
is limited by the fact that it gives reasonable estimates of R{t) only at (relatively 
low) temperatures where the Eyring-Kramers law holds. 

After choosing R{t) (resp. R{t)) as the preferred estimator, the stopping cri¬ 
terion can now be defined as follows: for a user-specified parameter e > 0, stop 
Algorithm 12.21 in Step 3 if and only if 


(4.6) 


R{t) > 1 — e (resp. R{t) > 1 — e). 


In Section [5] we give rigorous estimates of the bias and variance of the estimators 
R{t) and R{t). Such estimates will show that, as t increases, when the algorithm 
stops, on average at least (1 — e)% of the low temperature rate has been found. 


5. Analysis 

The approximation pj (<) oipj{t) is usually considered valid when <C V{sj) — 
V(m), with m the minimizer of V in fl. To the authors’ knowledge, rigorous results 
are scarce except when Sj = argmin,,^ ... sw ^(■Sj) ~ K(m); see [4] and references 
therein. However, the following is a consequence of results in [5]: 

Theorem 5.1. Suppose H = (a, b) is an interval and V is a Morse potential. Then 
for each t > 0, 

(5.1) = 1 + 0(l//3‘^‘) j = l,2. 

l-Pj{t) 

Proof. An examination of the proof of Theorem 4.1 of [5] shows that for j = 1,2, 
k^/Kj = 1 -h 0(l//3*'’) as —>■ oo. 
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where is as in (I4.4L and Kj is as in Theorem 13.21 at temperature /3 = The 
result follows. □ 


We next examine the approximation p{t) of p{t). 

Theorem 5.2. Conditionally on N(t) > 1, Nj{t) is an unbiased estimator for Kjt: 

(5.2) E[Nj{t)\N{t)>l] = Kjt. 

Also conditionally on N{t) > 1, Pj{t) is a conservative estimate of pj(t): 

(5.3) HPj{t) I Nj{t) > 1] < Pj{t). 

Proof. Recall that Nj{t) is a Poisson process with parameter Kj. Thus, 

E[N,{t) I N,{t) > 1] = (1 - 

71^2 

1 — n\ ^ 

n—1 

Since a; i—>■ 1 — e~^ is a concave function, the second statement of the theorem 
follows from Jensen’s inequality. □ 


The reason that we consider conditional expectations in Theorem 15.21 is that 
Algorithm 12.21 cannot stop before N{t) > 1. Thus, we want estimates conditioned 
on that event. We call Pj{t) a conservative estimate for Pj{t) because it is a lower 
bound on average, so that using Pj{t) in place of Pj{t) leads to a larger average 
stopping time for Algorithm 12.21 

Before proceeding we define, for a real-valued random variables X and Y, 

(5.4) Bias(W, Y) := E[A - T], MSE(A, Y) := Bias(X, Yf + Var(A). 

Observe that the mean square error is not symmetric in its arguments. 

Theorem 5.3. Write qj{t) = 1 — Pj{t) = exp[— and K = ki + ... + . For 

the estimator R{t), 


K 


Bias(R{t),R(t)) < A'max |Bias(p, (t),p, (t))| -I 

j j j 

Yai(R(t)) < 4—;- :prR{t)^ maxqjlt), 

(5.5) ^ 

MSE{R{t), R{t)) < 2N’^ maxMSE{pj{t),pj{t)) 


R{t) max qj{t), 
■j ^ 


H-;--TT ( 2 max qj(t) +4 ) R(tY max qj(t). 

min^ kj \ 3 ) 3 
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For the estimator R{t), 
(5.6) 


K 


Bias{R{t), R(t)) < A^max |Bias(pj(i),pj(i))| H-:- —R{t)maxqj{t), 

3 miiij kj j 


2K^ 


Var(ii(i)) < - ^R{tr mayiqRt) + I 1 + max g, (i) ) max Var(p, (t)), 

mmj fcj j ■' V j J j 


MSE{R{t), R{t)) <11 + + 2N^ maxgj(t) ] maxMSE(pj(<),pj(i)) 


+ 




miiij 


R{ty I 1 + inaxqj{t) I m.axqj(t). 


Here, all maxima and minima are taken over j € {1,..., N}. 
Proof. We give proofs in Section [5] below. 


□ 


We note that some of the bounds in Theorem 15.31 have been loosened so that 
simpler expressions are obtained. This will become clear in the derivation of the 
bounds in Section [5] below. We highlight that the bias is bounded by the bias 
of the estimate of Pj(t), together with another term representing an “inherent” 
bias associated with R{t). This second term may be approximated by noting that 
\R{t)\ < 1 for all t and, due to Theorem 15.11 we expect qj{t) can be estimated by 
the known function pj(t) or Pj{t). 


6. Estimates 

In this section we give a proof of Theorem 15.31 Recall that qj{t) := 1 —pj{t) and 
K := the total reaction rate. For brevity, we will sometimes suppress the 

t dependence in our expressions. Also, all sums are over 1,... ,7V unless otherwise 
indicated. 

6.1. Preliminary Calculations. Observe that 

Bias(i?(t), R(t)) = Bias(R(t), R{t)), MSE(R(t), R(t)) = MSE{R(t), R{t)) 

and similarly for this fact will be used below without comment. There are a 

few expressions that will show up repeatedly in the analyses of both R and R. We 
analyze them here for simplicity. Let 

(b.l) — ki P ^ ) kjjiXjji 

We make the following calculations: 

(6.2a) ki < < K 

(6.2b) E[^i] = ki + '^ Pmkm = AT - ^ qmkm 

A lower bound on this can be obtained from Jensen’s inequality, 

(6.3) E[C'] > Efe]-' = q^km 
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while an upper bound can be obtained from the Edmunson-Madansky inequality, 
,-ii ^ 1 K-E[^,] , 1 EK,]-fc, _ 1 1 


(6.4) < 


kj K — kj K K — kj K kiK 


^ ^ kmQm 


In the same way, 

(6.5) E[C"]>E[er" = 


m^i 


1 2 ^ , 

2 - ^ 9mfcr7 

m^i 


and 

(6.6) E[C^] < 


_2. IK- E[C,] , 1 E[^,] -h _ 1 K + h 


+ 


m^i 


Therefore, 

(6.7) Var(4-i) < 


K + k^ 2 


^ ^ Qmkm ^ ^ ^ ^m(0^m5 

m^i ^ m^i 


where we have lost some of the estimate in the last inequality for the sake of 
conciseness. 


6.2. Estimates for R. Below it is useful to notice that 


( 6 . 8 ) 


Uh+Y.m^,Xm{t)K 


= E 


PtXih 


6.2.1. Bias. We begin with the direct calculation 

N 

E[i7- i?] = ^E 


2=1 


XiPiki Xih 




K 


N 

-P*)E 

1 1 

•iss 

--2d 

1_1 

m 

+ 

N 

2=1 

1 1 

1_1 

N p 

+Ee 
2=1 •- 


XiPtkj _ Xrkj 


Kpi 


- 1 


K 

Pih 

K 


=bi 


Using (16.31) and (16.41) . 

1 

K 


kmqm - Qi <bi< j-y^ k 


mQm Qi' 


m^i 


m^i 


Thus, 


E , Piki 


N N 




kj ^ \ P^{t)h ^ Jfmaxj-gj-(t) 


i=i \i=i 


K 


miuj kj 


Combining the above expressions gives 

(6.9) Bias(.R(t), R{t)) < N max \pi{t) — Pi{t)\ + ^-^^^X—T-^R{t). 

i mini ki 
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R - E[i?] = XI ( ® 


2=1 


N 


6.2.2. Variance. For the variance, we first write 

N 

( 6 . 10 ) 

Hence, 

( 6 . 11 ) 


Var(i?(t)) = kikjPiPj Cov 

ij = l V_ 


Ptki. 


L. ^ 


Since Vij < y/^y/vJJ, it will be sufficient for us to analyze the diagonal terms. By 
Theorem 13.21 Xi and are independent. Thus 

(6.12) vu = E[x,]2 Var(e-') + E[CT Var(x,) + Var(C') Var(x,)- 
Using (EH) and (IQ) . 

vu < p^YaT{^~^) + p,q,E[^~'^] 

( I K + k, 

ImKm+Piqi^- 

m^i 


< Pi 


K + ki 2 ^ j 


K2 + fc2^2 


^ ^ Qmkn 


(6.13) 


m^i 


PiQi Api ^ ^Pi u 

^ m^i ^ ^ 

4 

pmaxgj(t). 


4pi 


< 


1 3 


We have made some sacrifices in the last inequalities in order to obtain a more 
concise expression. Consequently, 


N 


(6.14) 


Yai{R{t)) < X kikjPi{t)pj{t)y/v^y/v_ 

i ,3 = i 


33 


< 


mini ‘ 


lR{tY maxqi(t). 


6.2.3. MSB. Combining (16.91) and (16.141) . we then obtain 

MSE(.R(t), .R(t)) < 2A^^ max |pi(<) — pi(t)f 

(6-15) \ , 

H-^( 2maxgi(i) + 4 R{ty maxgj(i). 

min. kf \ i / i 

In this calculation, we see that the mean square error may ultimately be dominated 
by how well the Pi approximate the Pi . 


6.3. Estimates for R. We begin by noting that, since Pj{t) = 0 if XjY) Y !> 


m = X 

3 


Pj{t)kj 

L + Xmit)kj ■ 


(6.16) 
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6.3.1. Bias. We begin by writing 

N 


N 


( 6 . 17 ) 


i=i i=i ^ 


so that, after taking an expectation, 


(6.18) 

Hence, 


E[R- R]=J2^ 


2=1 


{Pi - Pi) 




N 


E ^ 


i=l 


K 

M. 


- 1 


K 


kiPi 

K ■ 


(6.19) Bias(i?(t), i?(t)) < A^max |Bias(pi(t),p(t))| H-;— —R{t)uia.^qi{t), 

i mini ki i 

and we see that the observed bias is controlled by the biases of the approximate 
probabilities, pi, and the inherent bias of the Chill type estimators. 


6.3.2. Variance. For the variance, we have 

N X. . s 

(6.20) Var(R) = kikj Cov ( ^ ) ■ 

' S7 sj ' 


7,1 = 1 


As before, we only need to study the diagonal entries, and use Theorem 13.21 to 
obtain 

Vii = E[p,f Var(4"^) + Var(pj) + Var(p,) Var(^"^) 

<Var(4-i)+E[CC]Var(p,) 


( 6 . 21 ) 


< 


min^ 


-2 maxgi + ( H-rly-a maxg^ ) Var(pi) 

7 \K^ mini 7 J 


< 


liui kf 


max qi 


1 


K'^ mini k'f * 


maxgi ) maxVar(pi). 


We note that these estimates require full independence of Nj{t) for j = 
not just independence of the Now, 

„ 2Ar^ - / \ 

(6.22) Var(.R(t)) < —;—-^i?(t)^ maxgi(t) + (1 + 2A^^ max 9 i(t)) max Var(pi(t)). 
mini kf 7 V i J i 


6.3.3. MSB. We can therefore express the mean square error of estimator R as 

4A:2 


(6.23) 


MSE(.R(t), i?(t)) < —;— -irRit)'^ fl + max( 7 i(t)') maxgi(t) 
mini kf Vi J i 


mini kf 

+ ^1 + + 2N^ maxgi(t)^ maxMSE(pi(t),pi(<)). 
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Figure 1. Comparison of the Chill type estimators R{t) and R{t) 
to the true expected proportion of the low temperature rate found, 
R{t), on a test system that can deviate from the Erying-Kramers 
law. 



t 



Figure 2. Comparison of the expected value of the Chill type 
estimators R{t) and R{t) to the true expected proportion of the 
low temperature rate found, R{t), on a test system that can deviate 
from the Erying-Kramers law. 


7. Discussion 

We have considered three Chill type estimators and shown them to be consistent. 
Their biases are small, relative to their variances, and thus we have good estimators 
of R{t), the true fraction of the observed rate in the system. They represent a 
significant improvement over the original AKMC stopping criterion presented in 
m- Indeed, these prior approaches attempted to estimate the fraction of the 
saddles observed when, in fact, it is the fraction of the observed rate that is of 
fundamental importance. 

As an example, we will compare the accuracy of both estimators using a test 
system that consists of saddle points sj corresponding to potential energy barriers 
V{sj) — V(m) = 1 -b for j = 0,..., 19. The test system has rates that obey a 
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modified Arrhenius equation with the form: 

(7.1) kf = gj exp[pV{sj) -V{m)]. 

Compare to equation (11.11) (recall the subscript i has been suppressed). The variable 
n controls how the rates deviate from an unmodified Arrhenius rate law. When 
n = 0 the modified rates are equal to the unmodified rates while when 
^hi ^ modified rates are larger (resp. smaller) than the unmodified rates 

if n > 0 (resp. n < 0). 

We use Algorithm 12.21 on the test system with modified rates k^\ This means 
are independent Poisson processes with parameters k^. To com¬ 
pute R{t), we use ()4.1|) and sample Xj{t) via (12.211 . To compute R{t) we use the 
unmodified Arrenius rates k^ in equation ()4.4I1 . For each of R{t), R{t) and R{t), 
the low temperature rates kj = fc*° used in equations dSD and (H3D are the same. 
We take gj = 1 for all j and /3*" = 2.5, /3*° = 10.0. The variable n was varied 
to compare the cases where the Erying-Kramers rates underestimate {n = i), 
overestimate in = — ^), and provide an exact estimate (n = 0) of the modified rates 
k^. Results are shown in Figures 1 and 2. The test system shows that R{t) can 
overestimate Rit) if the Eyring-Kramers rate deviates from the true rate at /3^‘, 
while R{t) tends to provide a conservative estimate of R{t). 
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